Hanwang Zhang

On Path to Multimodal Generalist: General-Level and General-Bench

May 07, 2025
Via arXiv

Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization

Apr 25, 2025
Via arXiv

Reasoning Physical Video Generation with Diffusion Timestep Tokens via Reinforcement Learning

Apr 22, 2025
Via arXiv

Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens

Apr 20, 2025
Via arXiv

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

Apr 17, 2025
Via arXiv

Generalized Visual Relation Detection with Diffusion Models

Apr 16, 2025
Via arXiv

Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene

Mar 19, 2025
Via arXiv

Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness

Mar 12, 2025
Via arXiv

Generalized Kullback-Leibler Divergence Loss

Mar 11, 2025
Via arXiv

Seeing World Dynamics in a Nutshell

Feb 05, 2025
Via arXiv